Reconstruction of damaged spectrographic features for robust speech recognition

Authors

  • Bhiksha Raj
  • Michael L. Seltzer
  • Richard M. Stern
Abstract

We present two missing-feature based algorithms that recover noise-corrupted regions of spectrographic representations of speech for noise-robust speech recognition. These algorithms modify the incoming feature vector without any changes to the speech recognition system, in contrast to previously described approaches. The first approach clusters the feature vectors representing clean speech. Missing data are recovered by estimating the spectral cluster in each analysis frame based on the uncorrupted feature values. The second approach uses MAP procedures to estimate the values of missing data elements based on their correlations with the features that are present. Both methods take into account bounds on the clean spectrogram implied by the noisy spectrogram. Large improvements in recognition accuracy are observed when these methods are used on speech corrupted by non-stationary noise when the locations of the corrupt regions of the spectrogram are known. We also present a new method of estimating the locations of corrupt regions in spectrograms that treats the problem of identifying these regions as one of Bayesian classification. This method, when used along with the best method to reconstruct them, results in recognition accuracies comparable with the best previous data compensation algorithm on speech corrupted by white noise. It also provides significant improvement on speech corrupted by music when the global SNR of the corrupted signal is known a priori.
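The correlation-based idea in the abstract can be illustrated with a minimal sketch. It assumes that clean log-spectral frames follow a single Gaussian with known mean and covariance, and that a reliability mask is already available. It computes the unconstrained Gaussian conditional mean of the missing components given the reliable ones, then clips it to the noisy observation, which serves as an upper bound on the clean value. The clipping is a simplification of a true bounded MAP estimate, and the function name `reconstruct_frame` and its parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def reconstruct_frame(y, reliable, mean, cov):
    """Fill in the unreliable components of one log-spectral frame.

    y        : noisy log-spectral vector, shape (d,)
    reliable : boolean mask, True where y is trusted
    mean, cov: Gaussian statistics of clean speech (assumed known)
    """
    y = np.asarray(y, dtype=float)
    reliable = np.asarray(reliable, dtype=bool)
    missing = ~reliable
    x = y.copy()

    if not missing.any():
        return x  # nothing to reconstruct
    if not reliable.any():
        # No evidence to condition on: fall back to the prior mean, bounded by y.
        x[missing] = np.minimum(mean[missing], y[missing])
        return x

    # Partition the covariance into reliable/reliable and missing/reliable blocks.
    c_rr = cov[np.ix_(reliable, reliable)]
    c_mr = cov[np.ix_(missing, reliable)]

    # Gaussian conditional mean of the missing part given the reliable part.
    est = mean[missing] + c_mr @ np.linalg.solve(c_rr, y[reliable] - mean[reliable])

    # Additive noise can only raise the observed energy, so the noisy value
    # bounds the clean value from above (simplified by clipping).
    x[missing] = np.minimum(est, y[missing])
    return x
```

For example, with zero mean, unit variances, and a correlation of 0.5 between two components, a reliable first component of 2.0 gives an estimate of 1.0 for the missing second component, unless the noisy observation there is lower, in which case the observation is used instead.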

Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have proven effective in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Reconstruction of missing features for robust speech recognition

Speech recognition systems perform poorly in the presence of corrupting noise. Missing feature methods attempt to compensate for the noise by removing noise corrupted components of spectrographic representations of noisy speech and performing recognition with the remaining reliable components. Conventional classifier-compensation methods modify the recognition system to work with the incomplete...

A new method for robust speech recognition based on missing data using a bidirectional neural network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing feature method. In this approach, the components of the time-frequency representation of the signal (the spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...

Robust Speech Recognition: The case for restoring missing features

Speech recognition systems perform poorly in the presence of corrupting noise. Missing feature methods attempt to compensate for the noise by removing unreliable noise-corrupted components of a spectrographic representation of the noisy speech and performing recognition with the remaining reliable components. Conventional classifier-compensation methods modify the recognition system to work with...

Missing Feature Imputation of Log-spectral Data for Noise Robust ASR

In this paper, we present a missing feature (MF) imputation algorithm for log-spectral data with applications to noise robust ASR. Drawing from previous work [1], we adapt the previously proposed spectrographic reconstruction solution to the liftered log-spectral domain by introducing log-spectral flooring (LS-FLR). LS-FLR is shown to be an efficient and effective noise robust feature extractio...

Publication date: 2000